AI Challenger : A Large-scale Dataset for Going Deeper in Image Understanding
نویسندگان
چکیده
Significant progress has been achieved in Computer Vision by leveraging large-scale image datasets. However, large-scale datasets for complex Computer Vision tasks beyond classification are still limited. This paper proposed a large-scale dataset named AIC (AI Challenger) with three sub-datasets, human keypoint detection (HKD), large-scale attribute dataset (LAD) and image Chinese captioning (ICC). In this dataset, we annotate class labels (LAD), keypoint coordinate (HKD), bounding box (HKD and LAD), attribute (LAD) and caption (ICC). These rich annotations bridge the semantic gap between low-level images and high-level concepts. The proposed dataset is an effective benchmark to evaluate and improve different computational methods. In addition, for related tasks, others can also use our dataset as a new resource to pre-train their models.
منابع مشابه
Learning Document Image Features With SqueezeNet Convolutional Neural Network
The classification of various document images is considered an important step towards building a modern digital library or office automation system. Convolutional Neural Network (CNN) classifiers trained with backpropagation are considered to be the current state of the art model for this task. However, there are two major drawbacks for these classifiers: the huge computational power demand for...
متن کاملConvolutional Attention-based Seq2Seq Neural Network for End-to-End ASR
Traditional approach in artificial intelligence (AI) have been solving the problem that is difficult for human but relatively easy for computer if it could be formulated as mathematical rules or formal languages. However, their symbol, rule-based approach failed in the problem where human being solves intuitively like image recognition, natural language understanding and speech recognition. The...
متن کاملEvaluation of Updating Methods in Building Blocks Dataset
With the increasing use of spatial data in daily life, the production of this data from diverse information sources with different precision and scales has grown widely. Generating new data requires a great deal of time and money. Therefore, one solution is to reduce costs is to update the old data at different scales using new data (produced on a similar scale). One approach to updating data i...
متن کاملGoing Deeper with Convolutional Neural Network for Intelligent Transportation
Over last several decades, computer vision researchers have been devoted to findgood feature to solve different tasks, such as object recognition, object detection,object segmentation, activity recognition and so forth. Ideal features transform rawpixel intensity values to a representation in which these computer vision problemsare easier to solve. Recently, deep features from c...
متن کاملMultimodal Dialogs (MMD): A large-scale dataset for studying multimodal domain-aware conversations
Owing to the success of deep learning techniques for tasks such as Q/A and text-based dialog, there is an increasing demand for AI agents in several domains such as retail, travel, entertainment, etc. that can carry on multimodal conversations with humans employing both text and images within a dialog seamlessly. However, deep learning research is this area has been limited primarily due to the...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- CoRR
دوره abs/1711.06475 شماره
صفحات -
تاریخ انتشار 2017